Abstract: Tenebrio molitor is a well-known model insect. Although much has been achieved
in many research areas related to this insect, very few molecular/genetic resources are available.
In this study, a high-throughput method was used to discover simple sequence repeat (SSR)
genetic markers from this beetle. In total, 1 249 SSR genetic markers were developed from the previously
constructed transcriptome database. The majority of them contained mono- and trinucleotide motifs
(44.44% and 41.15%, respectively), and A/T (42.70%) was the most abundant motif. Except for
mononucleotide repeats, SSRs with five repeat units were the most common, with a frequency of 30.90%.
Based on the identified SSRs, 1 004 pairs of primers were designed, with a maximum of 5 pairs of
alternative primers designed for a single SSR. The SSRs identified here will constitute an important
resource for marker-assisted investigation in functional and comparative genomics of T. molitor.
1 INTRODUCTION

Simple sequence repeats (SSRs) or microsatellites, ubiquitously distributed throughout
eukaryotic and prokaryotic genomes, are tandem repeat sequences of 1-6 base pairs of DNA
(Goldstein and Schlötterer, 1999). With the advantages of harboring high levels of
polymorphism, being stable, PCR-based and relatively low-cost, SSRs have become one of the
most popular genetic markers and are widely used in many areas of molecular biology, such as
genome characterization, genome mapping, comparative genomics, phylogenetic studies and
population genetics (Li et al., 2002, 2004). Traditionally, SSR marker development has
involved constructing and screening recombinant libraries, which is generally laborious,
time-consuming and expensive (Zane et al., 2002; Tang et al., 2008). With the advent of
next-generation sequencing technologies (a powerful alternative for generating a tremendous
number of DNA sequences), high-throughput transcriptome sequencing has dramatically
accelerated SSR marker development compared with the traditional methods (Wei et al., 2011;
Churbanov et al., 2012; Gao et al., 2012).

The yellow mealworm beetle, Tenebrio molitor (Coleoptera: Tenebrionidae), is a scavenger
that infests stored products including cereals, baking flour, and livestock feed. It is
thought to have originated in Europe and is now widespread over the world. It is an ideal
model organism among the higher eukaryotes for studies in biology, biochemistry, evolution,
immunology and physiology because of its relatively large size, ease of rearing and
handling, and genetic diversity (Pursall and Rolff, 2011; Dobson et al., 2012). Since
Arendsen Hein started experiments with T. molitor on a large scale in 1915, a large amount
of information related to this insect has been gathered in numerous research fields
(Arendsen Hein, 1920; Andersen et al., 1997; Pölkki et al., 2012). Although focused
investigation of T. molitor has a long history, very few molecular genetic/genomic
resources are available. The transcriptome data set of this insect has been generated using
the Illumina HiSeq™ 2000 platform, and a large number of assembled unigenes are available
(Zhu et al., 2012). Based on this database, a large-scale identification of SSRs in
T. molitor is reported in this study.
2 MATERIALS AND METHODS


2.1 Transcriptome database 

The transcriptome of T. molitor pupae was sequenced using Illumina paired-end sequencing
(Zhu et al., 2012). Raw data were deposited in the DDBJ database under accession no.
DRA000603. The unigenes assembled with the SOAPdenovo software were used for SSR
identification.
2.2 SSR identification 

A Perl script, the MIcroSAtellite identification tool (MISA,
http://pgrc.ipk-gatersleben.de/misa/), was used to identify SSRs in all unigene sequences.
The parameters were set to identify mono-, di-, tri-, tetra-, penta-, and hexanucleotide
SSR motifs with minimum repeat numbers of 12, 6, 5, 5, 4 and 4, respectively. The maximum
interruption allowed between two adjacent SSRs in a compound SSR was set to 100 bp.
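The thresholds above can be illustrated with a minimal Python sketch. It is not the MISA script itself: it only detects perfect SSRs with regular expressions, ignores compound SSRs and the 100 bp interruption rule, and the function name `find_perfect_ssrs` is a hypothetical helper introduced here for illustration.

```python
import re

# Minimum repeat counts per motif length, as stated in Section 2.2.
MIN_REPEATS = {1: 12, 2: 6, 3: 5, 4: 5, 5: 4, 6: 4}


def find_perfect_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Shorter motif lengths are scanned first, and later hits that overlap an
    already reported SSR are skipped, so a run of 12 A's is reported as
    (A)12 and never as (AA)6.
    """
    seq = seq.upper()
    hits = []
    covered = []  # (start, end) spans already assigned to an SSR
    for unit_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures one unit; \1{n-1,} demands n or more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            start, end = match.span()
            if any(start < e and s < end for s, e in covered):
                continue
            hits.append((start, match.group(1), (end - start) // unit_len))
            covered.append((start, end))
    return sorted(hits)


if __name__ == "__main__":
    demo = "CG" + "A" * 12 + "CG" + "AC" * 8 + "G" + "GGC" * 5 + "T"
    for start, motif, count in find_perfect_ssrs(demo):
        print(start, f"({motif}){count}")
```

Scanning short motifs first is a design choice of this sketch; it keeps a homopolymer run from being counted again under a longer, redundant motif.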
2.3 SSR primer design and validation 

For unigenes containing SSRs, only those with at least 150 bp of sequence on both sides of
the repeat region were retained for primer design. Primers were designed using Primer3
v2.2.2 with a length of 18-23 bp (optimum: 23 bp), an annealing temperature of 55-65°C
(optimum: 60°C; maximum difference between forward and reverse primers: 2°C), and a product
size of 80-300 bp. The following selection criteria were then applied: primers must not
contain an SSR (a 2-6 base motif repeated more than four times), and each primer must match
only one unigene when mapped against all unigenes (allowing a three-base mismatch at the
5' end and a one-base mismatch at the 3' end). To validate the SSR primer design, 12 primer
pairs were randomly selected and their products were amplified by PCR using T. molitor
genomic DNA as the template. The PCR profile was: 94°C for 2 min, followed by 30 cycles of
94°C for 20 s, 55°C for 30 s and 72°C for 1 min, and a final extension at 72°C for 10 min.
The PCR amplification products were analyzed by electrophoresis in 4% agarose gels.
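The design settings and two of the filters described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the helper names are hypothetical, the tag names follow the Primer3 Boulder-IO input format and should be checked against the manual for the version in use, the stated annealing temperatures are mapped onto Primer3's melting-temperature tags, and `PRIMER_NUM_RETURN=5` is inferred from the maximum of 5 alternative primer pairs per SSR. The uniqueness mapping against all unigenes is not covered.

```python
import re

# A 2-6 base motif repeated more than four times (five or more copies).
PRIMER_SSR = re.compile(r"([ACGT]{2,6})\1{4,}")


def has_enough_flank(seq_len, ssr_start, ssr_end, min_flank=150):
    """True if at least min_flank bases lie on each side of the SSR.

    ssr_start and ssr_end are 0-based, end-exclusive coordinates.
    """
    return ssr_start >= min_flank and seq_len - ssr_end >= min_flank


def primer_contains_ssr(primer):
    """True if the primer itself carries an SSR and should be rejected."""
    return PRIMER_SSR.search(primer.upper()) is not None


def boulder_record(seq_id, template, ssr_start, ssr_len):
    """Build one Primer3 Boulder-IO input record that targets the SSR."""
    tags = [
        ("SEQUENCE_ID", seq_id),
        ("SEQUENCE_TEMPLATE", template),
        ("PRIMER_FIRST_BASE_INDEX", 0),  # state the coordinate base explicitly
        ("SEQUENCE_TARGET", f"{ssr_start},{ssr_len}"),
        ("PRIMER_MIN_SIZE", 18),
        ("PRIMER_OPT_SIZE", 23),
        ("PRIMER_MAX_SIZE", 23),
        ("PRIMER_MIN_TM", 55.0),
        ("PRIMER_OPT_TM", 60.0),
        ("PRIMER_MAX_TM", 65.0),
        ("PRIMER_PAIR_MAX_DIFF_TM", 2.0),
        ("PRIMER_PRODUCT_SIZE_RANGE", "80-300"),
        ("PRIMER_NUM_RETURN", 5),  # assumption, see the lead-in
    ]
    lines = [f"{key}={value}" for key, value in tags]
    return "\n".join(lines) + "\n=\n"  # "=" terminates a Boulder-IO record


if __name__ == "__main__":
    print(boulder_record("Unigene_demo", "ACGT" * 100, 150, 15))
```

The record returned by `boulder_record` is plain text intended to be passed to the `primer3_core` program on standard input; `primer_contains_ssr` would then be applied to each returned primer.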


3 RESULTS 


A total of 71 514 unigenes comprising 30 319 407 nucleotides were examined for SSRs. Among
these sequences, 1 190 unigenes containing 1 249 SSRs were found, indicating that merely
1.67% of sequences contained SSRs. The motifs included mono-, di-, tri-, tetra-, penta- and
hexanucleotides, with motif lengths ranging from 1 to 6 bp. Fifty-five unigenes contained
more than one SSR. The majority of SSRs were perfect repeats; only 30 SSRs were compound,
accounting for 2.4% of the total. The numbers of SSRs with different numbers of tandem
repeats were calculated. Of the 1 249 SSRs identified, mono-, di-, tri-, tetra-, penta- and
hexanucleotide repeats numbered 555, 140, 514, 21, 12 and 7, respectively. Mononucleotide
repeats were thus the most abundant (44.44%), followed by trinucleotide (41.15%),
dinucleotide (11.21%), tetranucleotide (1.68%), pentanucleotide (0.96%) and hexanucleotide
repeats (0.56%) (Table 1). Among mononucleotide repeats, those with 12 and > 15 repeat
units were the most frequent. Except for mononucleotide repeats, SSRs with five repeat
units (30.90%) were the most common, followed by six (15.21%), seven (5.12%), eight
(1.84%), and four (1.04%) repeat units. Among all nucleotide repeats, A/T (42.70%) was the
most abundant motif (Table 2). Among di-, tri- and tetranucleotide repeats, only CG/CG,
ACT/AGT, AGG/CCT, and AAAT/ATTT displayed relatively low abundance; the other types were
nearly equal in frequency.

Using Primer3 v2.2.2, a total of 1 480 primer pairs were designed. After filtration, 1 004
primer pairs were retained, with a maximum of 5 pairs of alternative primers designed for a
single SSR. Twelve primer pairs were randomly selected to validate the predicted SSRs, and
all of them successfully amplified PCR products (Table 3).
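The class and repeat-number percentages reported above follow directly from the counts, and the arithmetic can be checked with a short Python sketch. The counts are taken from this section and Table 1; the helper name `percent` is introduced here only for illustration.

```python
# Counts taken from Section 3 and Table 1.
TOTAL_SSRS = 1249
BY_TYPE = {"mono": 555, "di": 140, "tri": 514, "tetra": 21, "penta": 12, "hexa": 7}
BY_REPEATS = {4: 13, 5: 386, 6: 190, 7: 64, 8: 23}
COMPOUND = 30


def percent(count, total=TOTAL_SSRS):
    """Share of count in total, as a percentage string with two decimals."""
    return f"{100 * count / total:.2f}"


# The six motif classes should add up to the reported total.
assert sum(BY_TYPE.values()) == TOTAL_SSRS

if __name__ == "__main__":
    for name, count in BY_TYPE.items():
        print(f"{name:6s} {count:4d} {percent(count)}%")
    for repeats, count in BY_REPEATS.items():
        print(f"{repeats} repeat units: {percent(count)}%")
    print(f"compound SSRs: {percent(COMPOUND)}%")
```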


4 DISCUSSION 


Among the powerful applications of SSRs, using microsatellite markers to understand the
evolutionary process and diversification of insects may help us protect beneficial insects
and control pests (Wang et al., 2009). Large numbers of SSRs have been isolated and
characterized in different insect orders. The abundance of SSRs, which differs greatly
between taxa, is known to depend on the SSR search criteria, the size of the dataset, and
the database-mining tools (Varshney et al., 2005). In this study, 1.67% of the
transcriptome sequences contained SSRs. This percentage is similar to the SSR contents in
the genomes of Bombyx mori, Drosophila melanogaster, Anopheles gambiae, Apis mellifera, and
Tribolium castaneum, which are 0.72%, 1.56%, 1.58%, 3.41%, and 0.41%, respectively (Archak
et al., 2007). The SSR composition of T. molitor is also similar to that of the above five
insect genomes, in which compound SSRs account for nearly 3.2% (Archak et al., 2007). In
general, trinucleotide repeats are the most abundant microsatellites in coding ESTs besides
mononucleotide repeats (Xu et al., 2012). Consistent with this, trinucleotide repeats were
predominant among the non-mononucleotide SSRs in T. molitor transcriptome sequences. The
large number of SSR primer pairs designed from the detected SSR marker sequences indicates
that transcriptome data offer a fast and cost-effective route to high-throughput discovery
of SSR markers in non-model organisms that lack a reference genome. The randomly selected
primer pairs all successfully amplified PCR products, which supports the utility of the
predicted SSRs. The SSR data constructed here will facilitate the development of suitable
SSR genetic markers for investigating the functional and comparative genomics of
T. molitor.

726    Acta Entomologica Sinica    Vol. 56

Table 1 Frequency of SSRs based on the number of repeat units in the Tenebrio molitor transcriptome

SSR type          Number of repeats
                  4     5     6     7     8     9     10    11    12    13    14    15    >15   Total   %
Mononucleotide    -     -     -     -     -     -     -     -     183   97    67    38    170   555     44.44
Dinucleotide      -     -     78    38    14    3     4     -     1     2     -     -     -     140     11.21
Trinucleotide     -     364   107   26    9     5     2     1     -     -     -     -     -     514     41.15
Tetranucleotide   -     16    5     -     -     -     -     -     -     -     -     -     -     21      1.68
Pentanucleotide   8     4     -     -     -     -     -     -     -     -     -     -     -     12      0.96
Hexanucleotide    5     2     -     -     -     -     -     -     -     -     -     -     -     7       0.56
Total             13    386   190   64    23    8     6     1     184   99    67    38    170
%                 1.04  30.90 15.21 5.12  1.84  0.64  0.48  0.08  14.73 7.93  5.36  3.04  13.61

Table 2 Frequency distribution of SSRs based on motif types in the Tenebrio molitor transcriptome

SSR motif         Number of repeats
                  4     5     6     7     8     9     10    11    12    13    14    15    >15   Total   %
A/T               -     -     -     -     -     -     -     -     178   90    63    37    161   529     42.70
C/G               -     -     -     -     -     -     -     -     5     7     4     1     9     26      2.10
AC/GT             -     -     20    14    11    1     -     -     1     2     -     -     -     49      3.95
AG/CT             -     -     28    14    3     1     4     -     -     -     -     -     -     50      4.04
AT/AT             -     -     28    10    -     1     -     -     -     -     -     -     -     39      3.15
CG/CG             -     -     2     -     -     -     -     -     -     -     -     -     -     2       0.16
AAC/GTT           -     48    16    2     1     2     -     1     -     -     -     -     -     70      5.65
AAG/CTT           -     52    15    3     -     -     1     -     -     -     -     -     -     71      5.73
AAT/ATT           -     60    31    8     5     1     -     -     -     -     -     -     -     105     8.47
ACC/GGT           -     23    7     4     -     -     -     -     -     -     -     -     -     34      2.74
ACG/CGT           -     56    8     3     -     -     -     -     -     -     -     -     -     67      5.41
ACT/AGT           -     4     4     -     -     1     -     -     -     -     -     -     -     9       0.73
AGC/CTG           -     30    4     1     1     -     -     -     -     -     -     -     -     36      2.91
AGG/CCT           -     13    4     -     1     1     -     -     -     -     -     -     -     19      1.53
ATC/ATG           -     29    8     2     -     -     1     -     -     -     -     -     -     40      3.23
CCG/CGG           -     49    10    3     1     -     -     -     -     -     -     -     -     63      5.08
AAAT/ATTT         -     9     2     -     -     -     -     -     -     -     -     -     -     11      0.89
Others            13    6     -     -     -     -     -     -     -     -     -     -     -     19      1.53

Sequence complementarity was considered.
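Table 2 groups each motif with its complementary-strand equivalent (for example AC/GT or CCG/CGG). One way to reproduce such labels is sketched below in Python: a motif and its reverse complement are each reduced to their alphabetically smallest cyclic rotation, and the two results are joined. This is an illustrative reconstruction that happens to match the labels in Table 2, not necessarily the procedure used internally by MISA, and `motif_class` is a hypothetical helper name.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _min_rotation(motif):
    """Alphabetically smallest cyclic rotation of a motif (GGC -> CGG)."""
    return min(motif[i:] + motif[:i] for i in range(len(motif)))


def motif_class(motif):
    """Label a motif together with its reverse complement, e.g. 'AC/GT'."""
    motif = motif.upper()
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    first, second = sorted((_min_rotation(motif), _min_rotation(revcomp)))
    return f"{first}/{second}"


if __name__ == "__main__":
    # Motifs taken from the validated loci listed in Table 3.
    for motif in ("GGC", "TTA", "GCG", "AGA", "TCG", "AC", "ATTT", "AT"):
        print(motif, "->", motif_class(motif))
```

Grouping by rotation and strand means that (GGC)5 and (GCG)5, which appear as separate loci in Table 3, fall into the same CCG/CGG class.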
Table 3 The characterization of selected SSR primers

No.  Unigene ID     Primer sequences             Product size (bp)  Motif and repeat number  PCR amplification
1    Unigene10146   F: CCTTCGTCTTCGCCTTGAG       145                (GGC)5                   S
                    R: AAGGAACTGTCCCAGAAGCTC
2    Unigene11122   F: ACGAAACAAAAACCACAG        144                (TTA)7                   S
                    R: TTTTAACATGAATGTTGGCTTCA
3    Unigene1309    F: GATCACGATCCTGGTCTTCC      142                (GGC)5                   S
                    R: TCTGATTATTAGACAACGCACGA
4    Unigene13708   F: AACGAGATGGTGCTGGAATC      135                (GCG)5                   S
                    R: CATCCCTAGCAGACTCGCA
5    Unigene14044   F: GTACCTGTATTGCTGAAACCGTC   154                (AGA)10                  S
                    R: CGTAACATGGAGTGTCGTCGTAT
6    Unigene17249   F: ACTCGTATGTCCCTTTGTCCTTT   124                (TCG)5                   S
                    R: GACAACGAAATCACTCTGGAGAT
7    Unigene17273   F: CTTCTCTGTCCATTTAGCACCAT   148                (AC)8                    S
                    R: GAAGTCCGTGTTGTGTGTTGTAA
8    Unigene1769    F: AGCTGGTGAACGACAGCAAC      155                (GAC)5                   S
                    R: GTTTTGGTTGAAGTACGTCGTG
9    Unigene63254   F: CTTCTTCTCTTTCTCCGGTTTTT   100                (TTC)6                   S
                    R: GACGAACTCGATGAGCAACCT
10   Unigene5676    F: ACAATCTCAACAAGAGGAAGTGC   121                (TCG)5                   S
                    R: GTGGAATTCTCCGAAGGTTCTT
11   Unigene64074   F: CACGCCGAAGTACCTACAAATAA   144                (ATTT)5                  S
                    R: TGGGAATAAACCATAAAAACCAA
12   Unigene8285    F: GTACCTATGGCGTTCTTTACACG   127                (AT)6                    S
                    R: TACACTCTTAGGGTCTGGGCATA

S: Successful amplification.


ACKNOWLEDGEMENTS We are grateful to YANG Yu-Zhi at Southwest Forestry University, China,
for conducting the SSR primer validation.


References

Andersen SO, Rafn K, Roepstorff P, 1997. Sequence studies of proteins from larval and
pupal cuticle of the yellow meal worm, Tenebrio molitor. Insect Biochem. Mol. Biol.,
27(2): 121-131.

Archak S, Meduri E, Kumar PS, Nagaraju J, 2007. InSatDb: a microsatellite database of
fully sequenced insect genomes. Nucl. Acids Res., 35: D36-39.

Arendsen I, 1920. Technical experiences in the breeding of Tenebrio molitor. Proc. Kon.
Ned. Akad. Wet. Amst., 23: 193.

Churbanov A, Ryan R, Hasan N, Bailey D, Chen H, Milligan B, Houde P, 2012. HighSSR:
high-throughput SSR characterization and locus development from next-gen sequencing
data. Bioinformatics, 28(21): 2797-2803.

Dobson AJ, Johnston PR, Vilcinskas A, Rolff J, 2012. Identification of immunological
expressed sequence tags in the mealworm beetle Tenebrio molitor. J. Insect Physiol.,
58(12): 1556-1561.

Gao X, Han J, Lu Z, Li Y, He C, 2012. Characterization of the spotted seal Phoca largha
transcriptome using Illumina paired-end sequencing and development of SSR markers.
Comp. Biochem. Physiol., 7D(3): 277-284.

Goldstein DB, Schlötterer C, 1999. Microsatellites: Evolution and Applications. Oxford
University Press, Oxford.

Li YC, Korol AB, Fahima T, Beiles A, Nevo E, 2002. Microsatellites: genomic distribution,
putative functions and mutational mechanisms: a review. Mol. Ecol., 11(12): 2453-2465.

Li YC, Korol AB, Fahima T, Nevo E, 2004. Microsatellites within genes: structure,
function, and evolution. Mol. Biol. Evol., 21(6): 991-1007.

Pölkki M, Krams I, Kangassalo K, Rantala MJ, 2012. Inbreeding affects sexual signalling in
males but not females of Tenebrio molitor. Biol. Lett., 8(3): 423-425.

Pursall ER, Rolff J, 2011. Immune responses accelerate ageing: proof-of-principle in an
insect model. PLoS ONE, 6(5): e19972.

Tang J, Baldwin SJ, Jacobs JM, Linden CG, Voorrips RE, Leunissen JA, van Eck H, Vosman B,
2008. Large-scale identification of polymorphic microsatellites using an in silico
approach. BMC Bioinform., 9: 374.

Varshney RK, Graner A, Sorrells ME, 2005. Genetic microsatellite markers in plants:
features and applications. Trends Biotechnol., 23(1): 48-55.

Wang ML, Barkley NA, Jenkins TM, 2009. Microsatellite markers in plants and insects.
Part I: applications of biotechnology. Genes, Genomes and Genomics, 3(S1): 54-67.

Wei W, Qi X, Wang L, Zhang Y, Hua W, Li D, Lv H, Zhang X, 2011. Characterization of the
sesame (Sesamum indicum L.) global transcriptome using Illumina paired-end sequencing and
development of EST-SSR markers. BMC Genomics, 12: 451.


Xu Y, Zhou W, Zhou Y, Wu J, Zhou X, 2012. Transcriptome and comparative gene expression
analysis of Sogatella furcifera (Horvath) in response to southern rice black-streaked
dwarf virus. PLoS ONE, 7(4): e36238.

Zane L, Bargelloni L, Patarnello T, 2002. Strategies for microsatellite isolation: a
review. Mol. Ecol., 11(1): 1-16.

Zhu JY, Yang P, Zhang Z, Wu GX, Yang B, 2012. Transcriptomic immune response of Tenebrio
molitor pupae to parasitization by Scleroderma guani. PLoS ONE, 8(1): e54411.

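High-throughput discovery of microsatellite primers for Tenebrio molitor based on transcriptome data

Abstract: The yellow mealworm beetle, Tenebrio molitor, is an ideal model organism. Although
it has been studied extensively in many research fields, little is known about its
molecular biology and genetics. In this study, 1 249 microsatellite sequences were
successfully identified from a previously constructed T. molitor transcriptome database.
Mono- and trinucleotide repeats were the most common, accounting for 44.44% and 41.15%,
respectively, and A/T repeats had the highest frequency (42.70%). Excluding mononucleotide
repeats, five was the most common number of repeat units (30.90%). Based on the identified
microsatellite sequences, 1 004 pairs of microsatellite primers were designed, with up to
5 pairs of alternative primers per SSR. The microsatellite primers obtained here will
support future research on the functional and comparative genomics of T. molitor.
Key words: Tenebrio molitor; transcriptome; microsatellite; genetic marker; primer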